CoCoS: Fast and Accurate Distributed Triangle Counting in Graph Streams
Authors
Abstract
Given a graph stream, how can we estimate the number of triangles in it using multiple machines with limited storage? Specifically, how should edges be processed and sampled across the machines for rapid and accurate estimation? The count of triangles (i.e., cliques of size three) has proven useful in numerous applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms, but little on their combinations for "the best of both worlds." In this work, we propose CoCoS, a fast and accurate distributed streaming algorithm for estimating the counts of global triangles (i.e., all triangles) and local triangles incident to each node. Making one pass over the input stream, CoCoS carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized. Compared to baselines, CoCoS is: (a) accurate: giving smaller estimation error; (b) fast: up to $10.4\times$ faster, scaling linearly with the size of the input stream; and (c) theoretically sound: yielding unbiased estimates.
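To make the idea of unbiased one-pass estimation concrete, the following is a minimal single-machine sketch. It is not the CoCoS algorithm and does not distribute work across machines; it uses plain Bernoulli edge sampling, and the function name `estimate_triangles` and the parameters `p` and `seed` are hypothetical names chosen for illustration. It assumes the stream contains no duplicate edges.

```python
import random
from collections import defaultdict


def estimate_triangles(edge_stream, p, seed=0):
    """One-pass estimate of global and local triangle counts.

    Illustrative sketch only (not CoCoS). Each arriving edge is first
    matched against the edges sampled so far, then kept with probability p.
    A triangle is detected when its last edge arrives and its two earlier
    edges were both kept, which happens with probability p**2, so every
    detection is weighted by 1 / p**2 to keep the estimate unbiased.
    """
    rng = random.Random(seed)
    neighbors = defaultdict(set)      # adjacency lists of sampled edges only
    global_count = 0.0                # estimate of all triangles
    local_count = defaultdict(float)  # estimate of triangles per node
    weight = 1.0 / (p * p)

    for u, v in edge_stream:
        if u == v:
            continue  # ignore self-loops
        # Every sampled common neighbor closes one triangle with (u, v).
        for w in neighbors[u] & neighbors[v]:
            global_count += weight
            local_count[u] += weight
            local_count[v] += weight
            local_count[w] += weight
        # Keep the new edge with probability p (always kept when p == 1.0).
        if rng.random() < p:
            neighbors[u].add(v)
            neighbors[v].add(u)

    return global_count, dict(local_count)


if __name__ == "__main__":
    # A 4-clique streamed edge by edge; with p = 1.0 the counts are exact.
    stream = [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4)]
    print(estimate_triangles(stream, p=1.0))
```

With `p = 1.0` every edge is stored and the result is the exact count; a smaller `p` trades storage for variance while the expected value stays the same. How to split such sampling across several machines without redundant work is the question the abstract addresses, and this sketch does not attempt it.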
Similar resources
DiSLR: Distributed Sampling with Limited Redundancy For Triangle Counting in Graph Streams
Given a web-scale graph that grows over time, how should its edges be stored and processed on multiple machines for rapid and accurate estimation of the count of triangles? The count of triangles (i.e., cliques of size three) has proven useful in many applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent...
FURL: Fixed-memory and Uncertainty Reducing Local Triangle Counting for Graph Streams
How can we accurately estimate local triangles for all nodes in simple and multigraph streams? Local triangle counting in a graph stream is one of the most fundamental tasks in graph mining with important applications including anomaly detection and social network analysis. Although there have been several local triangle counting methods in a graph stream, their estimation has a large variance ...
Continuous Distributed Counting for Non-monotonous Streams
We consider the continual count tracking problem in a distributed environment where the input is an aggregate stream originating from k distinct sites and the updates are allowed to be non-monotonous, i.e., both increments and decrements are allowed. The goal is to continually track the count within a prescribed relative accuracy at the lowest possible communication cost. Specifically...
Fast, accurate call graph profiling
Existing methods for call graph profiling, such as that used by gprof, deal badly with programs that have shared subroutines, mutual recursion, higher-order functions, or dynamic method binding. This article discusses a way of improving the accuracy of a call graph profile by collecting more information during execution, without significantly increasing the overhead of profiling. The method ...
A second look at counting triangles in graph streams
In this paper we present improved results on the problem of counting triangles in edge streamed graphs. For graphs with $m$ edges and at least $T$ triangles, we show that an extra look over the stream yields a two-pass streaming algorithm that uses $O(m / (\epsilon^{4.5} \sqrt{T}))$ space and outputs a $(1 + \epsilon)$ approximation of the number of triangles in the graph. This improves upon the two-pass streaming tester of B...
Journal
Journal title: ACM Transactions on Knowledge Discovery From Data
Year: 2021
ISSN: 1556-472X, 1556-4681
DOI: https://doi.org/10.1145/3441487